Learning from Incomplete Data
نویسندگان
چکیده
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives|the likelihood-based and the Bayesian. The goal is twofold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering , classiication, and function approximation from incomplete data in a principled and eecient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977)|both for the estimation of mixture components and for coping with the missing data.
منابع مشابه
Ensemble-based Top-k Recommender System Considering Incomplete Data
Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering system, used to either predict whether a user will prefer an item (prediction problem) or identify a set of k items that will be user-interest (Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...
متن کاملMining from incomplete quantitative data by fuzzy rough sets
Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. If some attribute values are unknown in a data set, it is called incomplete. Learning from incomplete data sets is usually more difficult than learning from complete data sets. In the past, t...
متن کاملLearning to Classify Incomplete Examples
Most research on supervised learning assumes the attributes of training and test examples are completely speciied. Real-world data, however, is often incomplete. This paper studies the task of learning to classify incomplete test examples, given incomplete (resp., complete) training data. We rst show that the performance task of classifying incomplete examples requires the use of default classi...
متن کاملStructure Learning of Probabilistic Relational Models from Incomplete Relational Data
Existing relational learning approaches usually work on complete relational data, but real-world data are often incomplete. This paper proposes the MGDA approach to learn structures of probabilistic relational model (PRM) from incomplete relational data. The missing values are filled in randomly at first, and a maximum likelihood tree (MLT) is generated from the complete data sample. Then, Gibb...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1995